ABSTRACT


An idealised model of the free-response judging process is developed, and 
its elements discussed in terms of judging practices in those free-response 
studies published in full between 1964 and 1985. A wide variety of
occasionally conflicting judging practices was found, along with valuable
indications for further research in this important area.
While free-response methodology has been popular in ESP studies over
recent years, very little research has been directed to the important question
of how best to judge the correspondence between free-response material and the
target. However, many experimenters have commented on judging issues, or have 
reported relevant analyses or data which, when brought together, may suggest 
strengths and weaknesses in our judging practices, and promising directions 
for future research. 

With these aims in mind, I have examined various aspects of procedure 
which might influence the success of judging, using as a database eighty-five 
free-response studies in which statistical assessment of the results was 
attempted and which were published in full between 1964 and 1987 inclusive, in 
the Journal of Parapsychology, Journal of the American Society for Psychical
Research, Journal of the Society for Psychical Research, International Journal
of Parapsychology, and European Journal of Parapsychology. Space constraints
prevent me from presenting a summary table of these studies and their full
references, but these can be obtained from me on request. All of the papers
in these journals (whether experimental or not), and those appearing in
Research in Parapsychology during the same period, were searched for 
commentary relevant to free-response judging, as well as other sources where 
appropriate. The survey is in two sections. In the first section, a model of 
an ideal judging process is presented, and its elements discussed in terms of 
their importance in current judging practices. The second section addresses 
the issues of whether percipients or independent judges are best suited to 
perform the complex judging task, and what qualities a judge should have. 
Finally, the findings of the review are discussed with their implications for 
further research. 


THE ELEMENTS OF JUDGING 

The underlying structure of the judging process 

In a free-response ESP experiment, the percipient’s task is to observe 
and report his or her thoughts, imagery, feelings and mental or physical 
experiences, which might relate to a randomly selected target. In 
free-response studies, the targets used are generally fairly complex (they may 
be people, or geographical locations, objects, and so on). The targets may 
have elements (such as colour, the presence or absence of people) which differ 
in their salience for the percipient, and in their frequency of occurrence. 
In addition, targets may be regarded as possessing various broad categories of 
content (such as semantic content, or emotional content), each of which broad 
categories may differ in their salience. The salience of both individual 
elements and categories of content may differ from one percipient to another, 
depending on individual differences. 

Just as free-response targets are complex and varied, so too are the 
mentations reported by percipients. Mentations may be in the form of imagery 
in any sense modality, or merely abstract concepts; they may be vivid, bizarre,
fleeting, spontaneous, or have other distinguishing characteristics. Content
of various kinds may be present in them, with varying chance frequencies of 
occurrence. Mentation items may relate in a variety of ways to the target 
material, such as semantically or by association, and to a greater or lesser 
degree. The type of correspondence may vary from percipient to percipient, or 
from mentation to mentation, or both. Certain types of mentation, and certain 
kinds of target-mentation correspondence may be more likely to carry psi 
information than others. 

The function of a free-response judge (in process-oriented research at
least) is to estimate the probability that psi was responsible for any
resemblance between the target
and the mentation (or inversely, the strength of the ESP component on a given 
trial). In the complex situation described above, one way of looking at the 
task of an ideal judge is that he or she should: 

(i) Assign some numerical value in proportion to the degree of
correspondence between a single mentation item and the target (and, in some
types of judging, to the controls);

(ii) increase this value (given a perfect match) in accordance with the
rarity of occurrence of the mentation item's content in the mentation of all
percipients in similar experimental conditions (or in the mentation of that
particular percipient on other trials, if such data is available);

(iii) increase this value (given a perfect match) in accordance with the
rarity of occurrence of the mentation item’s content in the entire
experimental target pool;

(iv) increase this value in accordance with the likelihood that the
mentation item, by virtue of its characteristics, is psi-related (e.g.,
whether it was bizarre, vivid, spontaneous, or whatever characteristics, if
any, are shown by research to mediate ESP);

(v) increase this value in accordance with the salience which the
content of the mentation has for the percipient (e.g. if research shows that
the presence of people in a target is highly salient to a percipient, then a
mentation item bearing on the presence or absence of people would be weighted
relatively heavily);

(vi) increase this value in accordance with the likelihood that the type 
of correspondence (semantic, emotional, etc.) between mentation item and 
target carries psi-related information, if such differences in likelihood are 
indicated by research. 

Having thus arrived at a weighted measure of the correspondence between 
each mentation item in a trial and the target (and controls if appropriate), 
the measures may be summed across the trial or otherwise combined to yield the 
ESP score for that trial. Although this procedure resembles an atomistic 
judging procedure most closely in its structure, it can also be thought of as 
an implicit or idealised basis for holistic or coded judging procedures. In 
holistic judging, it is possible to think of the overall rating assigned to 
items in the judging set as a sum of individual mentation ratings weighted as 
appropriate. In coded judging, the decision of whether a given content
category was present or absent could be regarded as being made according to
the sum of weighted ratings of relevant mentation items. Further weightings
could then be assigned to each decision according to the known salience of the
content category and the rarity of that value of the code in the target pool.
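
The six elements of the idealised model can be sketched in code. This is only an illustration under stated assumptions: the model says each weighting should "increase" the correspondence value but does not say how, so the multiplicative combination, the use of inverse frequencies for rarity, and all names below (MentationItem, item_score, trial_score) are hypothetical choices, not a procedure taken from any of the studies surveyed.

```python
from dataclasses import dataclass


@dataclass
class MentationItem:
    correspondence: float         # (i) rated match to the target, e.g. 0.0 to 1.0
    mentation_frequency: float    # (ii) chance rate of this content in mentation, in (0, 1]
    target_frequency: float       # (iii) rate of this content in the target pool, in (0, 1]
    psi_likelihood: float = 1.0   # (iv) weight for mentation characteristics
    salience: float = 1.0         # (v) weight for salience of the content
    type_weight: float = 1.0      # (vi) weight for the type of correspondence


def item_score(item: MentationItem) -> float:
    """Weighted correspondence for one mentation item.

    Assumption: rarity is the inverse of the joint chance frequency, and
    all weights combine by multiplication.
    """
    rarity = 1.0 / (item.mentation_frequency * item.target_frequency)
    return (item.correspondence * rarity * item.psi_likelihood
            * item.salience * item.type_weight)


def trial_score(items) -> float:
    """Sum the weighted item scores across a trial (one possible combination)."""
    return sum(item_score(item) for item in items)
```

With neutral weights of 1.0 for elements (iv) to (vi), the score reduces to a rarity-weighted correspondence rating, which is the part of the model for which frequency data could most readily be collected.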

The importance of elements of judging in the literature

Each of the six elements of judging in various forms has received
occasional attention either implicitly or overtly in experimental and
theoretical papers, although very little direct or systematic research has
been done on this topic. Most opinion about how best to judge free-response
material seems to be based on anecdotal observations. While such observations
may be unreliable, they may also contain useful insights into
judging which should be investigated empirically. This being so, each of the
six elements of judging is discussed in turn below in the context of
commentary and experimental results in the literature surveyed.


(i) Assignment of a numerical value to correspondence

Ideally, the value assigned to the correspondence between a mentation
item and a target should reflect the correspondence in some objective (and
hence reliable) way. Sixteen studies reported in 10 papers in the database
surveyed used atomistic judging, but in no case was interjudge reliability
calculated for the allocation of such ratings. In eight of the studies, each
point on the rating scale was labelled for the use of the judges (e.g., 0 =
"no correspondence"), which practice might be expected to increase interjudge
reliability. The number of points on the rating scale ranged from two to
eleven, with a mean of 4.2, and it is possible that the scales at the low end
of the range may be too constrained to be sensitive, while those at the higher
end require judges to make finer judgements than is appropriate, and so
may be insensitive in effect because they increase error variance. In this
latter case of large rating range, interjudge reliability may be reduced. The
same may be true of holistic rating scales, which ranged from 4 points to 101,
and which were clearly reported as being labelled in only 14 out of the 52
studies in which a holistic scale was used. The number of items in the
judging set may be a factor in determining the appropriate rating scale; in 
the studies surveyed, set size ranged from 2 to 36 items. Any future research 
which addresses the issue of the appropriate rating scale in this task could 
most usefully do so in the context of active training of judges, with 
feedback, in the use of such scales. Boerenkamp (1984) had considerable 
success in training eight independent judges to rate each statement made by a 
"psychic" about a missing person on a fully-labelled four-point scale of 
likelihood that it would apply to anyone in the population. To test the 
reliability of the judges’ ratings, the judges were randomly assigned to two 
groups of four, and the average ratings of each statement were correlated,
yielding correlations ranging from r_s = +0.66 (36 df, p<0.01) to r_s = +0.93
(19 df, p<0.01). The training consisted of having each judge rate
independently the first statement in the transcript, followed by a discussion
among the judges about the differences in their scoring. Then the second 
statement was scored and discussed, and so on, allowing the judges to discover 
why they differed from the group norm and to adjust their rating strategy 
accordingly. Similar training in rating statements for the likelihood that 
they were the product of deductive reasoning, also on a fully-labelled 
four-point scale, yielded similarly respectable interjudge reliabilities,
ranging from r_s = +0.66 (72 df, p<0.01) to r_s = +0.95 (20 df, p<0.01).
Although no pretraining measures of reliability were taken, the assignment of 
ratings of the likelihood that a statement would be true of a person on a 
fully-labelled three-point scale by two untrained judges in a study by Tart 
and Smith (1968), showed perfect agreement only 49% of the time. The 
reliability of Boerenkamp’s judges is relatively high compared to that
generally obtained in free-response judging, and this may be a useful method
for training the judges in the reliable assignment of ratings to
mentation-target correspondence.
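
A split-half reliability check of the kind described for Boerenkamp's judges might be sketched as follows. This is a minimal illustration, assuming that the reported coefficient is a Spearman rank correlation and that the ratings are held as one list per judge; the function names are hypothetical and the code is not taken from Boerenkamp (1984).

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson's r on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_half_reliability(ratings, rng=None):
    """ratings[j][s] is judge j's rating of statement s.

    Judges are randomly split into two groups, each statement's ratings are
    averaged within each group, and the two sets of averages are correlated.
    """
    rng = rng or random.Random()
    judges = list(range(len(ratings)))
    rng.shuffle(judges)
    half = len(judges) // 2
    group_a, group_b = judges[:half], judges[half:]
    n_statements = len(ratings[0])
    mean_a = [mean(ratings[j][s] for j in group_a) for s in range(n_statements)]
    mean_b = [mean(ratings[j][s] for j in group_b) for s in range(n_statements)]
    return spearman(mean_a, mean_b)
```

Running the same check before and after training would give the pretraining baseline that was not collected in the work described above.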

Maren (1986) discusses the application of artificial intelligence (AI) to 
give measures of the correspondence between target and mentation. However, 
she stresses that the development of appropriate AI systems is at an early 
stage. It seems that for the time being, the best bet for improving the 
reliability of atomistic (and possibly holistic) ratings may be the training 
of judges, with feedback, in the use of fully-labelled scales with a range 
appropriate for the task. 
(ii) Weighting in accordance with the rarity of the mentation item’s content

The probability of an exact match between a mentation item and an element
of the target is equal to the product of the probability that the mentation
item should occur on that trial out of all the others, and of the probability
that that target element should be present in that target out of all other
targets. This being so, the rarer the mentation item, the more weight it
should receive. Stanford’s response-bias hypothesis (1967, [...]), from a
different angle, also suggests that rare responses should be relatively
heavily weighted.

Although a number of experimenters have instructed judges to attach more
weight to rare correspondences (e.g. Palmer, Khamashta, & Israelson, 1979;
Sargent, 1980), the calculation of frequencies of mentation occurrence has
seldom been attempted. The exceptions are studies by Roll (1971), Roll et al.
(1973), and Tart and Smith (1968). In these studies, statements made by a
medium about a number of people were weighted inversely according to the number of
people in the study about whom the medium made the same statement.
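
The inverse weighting described for these studies can be illustrated with a short sketch. The source says only that statements were weighted inversely according to the number of people about whom they were made; the exact form 1/n used here, and the function name, are assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(statements_by_person):
    """Map each statement to 1 / (number of people it was made about).

    statements_by_person: dict of person -> list of statements made about them.
    A statement repeated for the same person is counted once for that person.
    """
    counts = Counter()
    for statements in statements_by_person.values():
        counts.update(set(statements))
    return {statement: 1.0 / n for statement, n in counts.items()}


readings = {
    "A": ["tall", "has a dog"],
    "B": ["tall"],
    "C": ["tall", "musician"],
}
weights = inverse_frequency_weights(readings)
# "tall" was said of all three people, so it carries the least weight.
```

A statement made about everyone in the study then contributes little to any one person's score, while a statement made about a single person contributes fully.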

Any further work attempting to calculate norms for mentation content will
have to take into account a number of factors. The setting may be
important: in the ganzfeld, for example, the white noise often elicits imagery
about waterfalls, beaches and aeroplanes. Some responses are common in
certain states of consciousness; for example, [...] (1974) and Braud, Wood
and Braud (1975) instructed their percipient (who also did the judging) to
attempt to distinguish target-relevant impressions from those induced by the
state itself (in this case, conventional meditation [...]).

As well as being dependent on the situation and state of consciousness of
the percipients, mentation content frequencies may also vary from percipient
to percipient; most experimenters will probably have come across percipients
who, in repeated testing, always mention one or more specific images [...]
in their trials, while in contrast, Sargent, Bartlett and Moss [...].
[...] a percipient in a study in which a geographical location is the target
may be more likely to talk about trees than a percipient in a study in which
aspects of a person is the target.

(iii) Weighting in accordance with the rarity of the mentation item in the target pool

As stated in (ii) above, the probability of an exact match between a
mentation item and an element of the target decreases as the probability of
[...] decreases. Therefore, [...] it should be [...].

Dunne and Jahn (1980) calculated the a priori probabilities of the
values of their thirty binary descriptors in [...] were absent,
[...] etc., to [...] the heavier [...] nor weighted.
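
One way of turning target-pool frequencies into weights for binary descriptors is sketched below. It assumes that each target and each response is coded as a list of 0/1 descriptor values, and that a correct match is weighted by the inverse of the a priori probability of the matched value in the pool. Both the inverse-probability form and the function names are assumptions for illustration; this is not presented as the formula used by Dunne and Jahn (1980).

```python
def descriptor_priors(target_pool):
    """Proportion of targets in the pool for which each descriptor is present (1)."""
    n_descriptors = len(target_pool[0])
    return [
        sum(target[d] for target in target_pool) / len(target_pool)
        for d in range(n_descriptors)
    ]


def weighted_match_score(target, response, priors):
    """Sum of inverse-probability weights over descriptors that match.

    Assumes the target is drawn from the pool used to compute the priors, so
    the probability of any value the target actually takes is above zero.
    """
    score = 0.0
    for t, r, p in zip(target, response, priors):
        if t == r:
            p_value = p if t else 1.0 - p   # probability of the matched value
            score += 1.0 / p_value
    return score


pool = [[1, 0], [1, 1], [1, 0], [0, 0]]
priors = descriptor_priors(pool)   # first descriptor common, second rare
```

Under this scheme a correct "present" on the rare second descriptor earns more than a correct "present" on the common first one, which is the direction of weighting that element (iii) of the model calls for.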
(iv) Mentation characteristics

[...] categories [...]. Bennet (1982) found that [...] and [...]
significantly above chance, although scoring on the basis of "bizarre"
mentation was [...] compared to the theoretical [...] bizarre imagery [...]
on the basis of other imagery. Milton (1984) found significant [...]
psi-hitting on the basis of "surprising" imagery and [...] found significantly
below-chance scoring on "fleeting" imagery [...] results of one of
two independent judges. A third study by Milton [...]
of mentation categories found [...].

[...] (1977) had a judge place mentation
items [...] of seventeen categories, and had
a second judge compare [...] for that night and mark it
"telepathy present" or "telepathy absent" [...]
imagery, hypnagogic [...] elaboration [...] degree, the association with
waking imagery being a negative one. [...] the target [...]
experiment-related, hostility [...]. [...] at least in part the judges’
expectations: he or she might have been more inclined to consider telepathy
present for a mentation item which fell into a category which he or she
expected to be successful.

[...] Merkestein (1985) [...] percipient [...] experiences during the
day and later selected the day’s target [...] from the pool on the basis of
these experiences. In [...] judging task, Merkestein selected independent
[...] not significantly target-[...] experiences [...].

In addition to [...] offered
anecdotal observations [...] mentation appears to be more
psi-related than others. Reporting on informal research discussion among
apparently successful percipients and [...] fleeting, novel, or
recurrent tended to be psi-related and that [...] impressions, including
kinaesthetic, auditory and [...], were of equal or greater
importance compared to visual [...]. Honorton and Harper (1974) observed
that memory images seemed to be successful in [...] ganzfeld study; Dunne and
Bisaha (1979) commented that logical inferences from an initial impression
were unhelpful. In a [...]
to pay more attention to "images that were unique in the general context of
ongoing train of thought" (p.138).

Other experimenters have instructed judges to pay more attention to 
particular mentation categories. Thus Sargent (1980) instructed judges to pay 
more attention to mentations which the percipient reports as being novel,
striking, odd, unusual, unexpected or particularly clear, and to pay less
attention to mentations clearly linked with an immediate memory (thus not
conflicting with the comment of Honorton and Harper (1974) above, which
presumably relates mostly to long-term memories). The criteria for deciding
whether or not one of Honorton’s (1975) binary content categories is present
in the mentation include "intensity" and persistence of content-related
mentation.

It can be seen that a number of experimenters feel that certain mentation
types may be more likely to be psi-related than others, although authors vary
in their choices, and few seem to have based their opinions on formal research 
findings. This would seem to be another aspect of the judging process which 
would benefit from systematic, direct research, with anecdotal observations as 
a valuable starting point. 

(v) Salient aspects of targets 

In an ideal judging situation, those elements of the target material 
which are most salient for the percipient should be more heavily weighted in 
the judging than elements known not to be salient. For the purposes of this 
discussion, ’salient’ describes an element of the target about which the 
percipient tends to give accurate information more often than chance. Thus if 
percipients were very often correct about whether or not people were present
in a pictorial target, then mentation items dealing with the presence or
absence of people should be more heavily weighted than other mentation items.

Roll et al. (1973) applied such a weighting, according to content
category, to mentation items made by a sensitive and meant to apply to various
people. The content categories were those of physical description, health,
vocation/education, family, love life, future, wants, interests,
personal characteristics, and other, and mentation items of half the data
were weighted in accordance with the success of mentation items in those
content categories in the other half of the data.
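
The split-half weighting described here can be sketched as follows. The sketch assumes that each mentation item is recorded as a (category, correct) pair and that a category's weight is simply its hit rate in the calibration half; the source does not give the weighting formula, so that choice and the function names are illustrative assumptions rather than the procedure of Roll et al. (1973).

```python
from collections import defaultdict


def category_hit_rates(items):
    """Proportion of correct items in each content category.

    items: iterable of (category, correct) pairs from the calibration half.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for category, correct in items:
        totals[category] += 1
        if correct:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


def weighted_score(items, weights):
    """Score the other half: each correct item earns its category's weight.

    Categories never seen in the calibration half get a weight of 0.0.
    """
    return sum(weights.get(category, 0.0) for category, correct in items if correct)


calibration_half = [
    ("health", True), ("health", True),
    ("family", True), ("family", False),
    ("future", False),
]
test_half = [("health", True), ("family", True), ("future", True), ("health", False)]
weights = category_hit_rates(calibration_half)
score = weighted_score(test_half, weights)
```

Deriving the weights from one half and applying them to the other keeps the weights independent of the items being scored, which matters if the weighted scores are then to be assessed statistically.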

The content categories used by Roll et al. were presumably chosen since
most of the sensitive’s mentation could conveniently be coded in terms of 
them, rather than because each of these categories was believed to be highly 
salient; indeed, the study was partly one of salience. However, in studies 
where mentation and targets are coded in terms of content categories (e.g. 
Honorton, 1975; Jahn, Dunne and Jahn, 1980), content categories seem to be 
chosen not for their salience but for similar pragmatic reasons of allowing a 
fairly full description of the mentation report. Further research identifying 
salient content categories, to allow them to be weighted appropriately or used 
as the basis of coding systems, would be useful. 


(vi) Types of correspondence

A number of authors have discussed ways in which mentations have appeared
to relate to targets in their studies, and some authors instruct their judges
to watch out for some of these correspondences. Those mentioned have included
literal, formal (shape), sensory (colour, material), symbolic/metaphorical,
associational, emotional and functional correspondences, and it has been
suggested that these correspondences may relate to either parts of the
target, or to the whole, or both. However, authors differ, and sometimes
conflict, in the importance they attach to these correspondence types. Some 
authors only take into account one or two types of correspondence, while 
others deal with most of them but weight heavily certain types which other 
authors feel are unimportant. 

For example, Sargent, Bartlett and Moss (1982) in their judging 
instructions attach most importance to direct (presumably literal) 
correspondences, and then consider formal, associative, symbolic, and 
mood/emotive correspondences in order of decreasing importance (a similar 
order of importance is reflected in Sargent’s (1982) judging instructions). 
These instructions conflict with the opinions of several other researchers, 
such as Dunne and Bisaha (1979) who consider that correspondences of shape, 
colour, size, and relation to other shapes, and metaphorical correspondences 
are more likely (and presumably more important) than literal correspondences. 
Similarly, Targ and Puthoff (1978) feel that correspondences of shape, form, 
colour and material are likely to be more accurate than correspondences of 
function or name; Schlitz and Haight (1984) instructed their judges to expect 
correspondences of shape or association rather than literal correspondences; 
Gelade and Harvie (1975) commented that accurate descriptions were rare, and 
that metaphorical and symbolic correspondences were more frequent; Hearne
(1986) instructed independent judges to look particularly for symbolic
correspondences, and Stanford (1979) used artists as judges on the basis of
comments by other researchers indicating that meaning was often distorted in
mentation but that the form of the target was often described correctly. Other
researchers, while instructing their judges about the types of possible 
correspondence have either urged their judges to give equal weight to all 
types, or have given instructions in which no type of correspondence was made 
to seem more important than any other (Moriarty and Murphy, 1967; Musso and 
Granero, 1973; Palmer, Khamashta and Israelson, 1979; Palmer, Bogart, Jones 
and Tart, 1977; and Wood, Kirk and Braud, 1977). 

These differences among authors could be due to a number of different 
factors. Firstly, the type of correspondence thought to be most important in 
judging may relate to the percipient’s mode of response, which tends to vary 
from study to study. Those experimenters who encourage their percipients to 
make drawings of their imagery have more opportunity to note correspondences 
(real or spurious) of form than meaning, while the reverse applies to those 
who encourage their percipients to make verbal responses. This may account 
for Sargent’s preference of meaning over form in his ganzfeld studies in which 
percipients make mostly verbal responses, in contrast to the preference of 
form over meaning in the studies of, for example, Moriarty and Murphy (1967) 
and Musso and Granero (1973) which were both picture drawing studies. 

A second factor in differences among authors may relate to individual 
differences between authors themselves, or between the percipients in those 
authors’ studies. Hearne (1986) emphasised symbolic correspondences in his 
instructions to judges because the single percipient in that study seemed to 
have obtained such correspondences in earlier testing. Ullman (1966), 
discussing work on field dependence by Witkin (1965), suggested that the type 
of correspondence in each percipient’s mentation might reflect whether the 
percipient is field dependent, with field dependent percipients yielding 
symbolic correspondences. In addition, in a study with both types of 
correspondence, the types of correspondence noted by the experimenter may 
depend in part on whether he or she is field dependent. The use of different
types of target material may also result in different kinds of
correspondence; for example, abstract art prints may yield correspondences of
form and sensory qualities, while pictorial representations of archetypes
(such as those used by Gertz, 1983) may tend to yield symbolic
correspondences.

Approved For Release 2000/08/15 : CIA-RDP96-00792R000701020002-6


PERCIPIENT JUDGES VERSUS INDEPENDENT JUDGES 

So far, I have discussed the steps to be taken in an idealised judging 
process. A related issue is that of who is most likely to be suited to such a
task. Most discussion in the literature on this issue concentrates on
the relative merits of percipients as judges of their own material, and of
independent judges. The fact that at least one independent judge was used in
58.2% of the 98 studies in the database in which the use of an independent
judge was appropriate may indicate a preference for independent
judges.

Several reasons have been put forward for why independent judges should 
be preferred. First, the use of independent judges should give a uniformity
of judging criteria across trials which may be lacking when percipients judge,
resulting in reduced error variance with independent judges (Palmer, Bogart,
Jones and Tart, 1977). Second, it should be easier to select or train
good judges than to select numerous experimental participants who
are both good percipients and good judges (Palmer, Bogart, Jones and Tart,
1977). Third, the use of percipients as judges is likely to confound their
ESP performance with their judging ability, such that relationships between
their ESP score and other variables may be partly with their judging ability
rather than their ESP; for example, a correlation between extraversion and the
ESP measure may be due to the extravert percipients judging more carefully to
please the experimenter and hence increasing their ESP score (Stanford, 1978,
[...]). Fourth, the use of independent judges means that the percipient need
only be shown the target at the end of the trial, which some experimenters
feel may reduce the risk of precognitive displacement (Palmer, Bogart, Jones
and Tart, 1977; Irwin, 1982). Fifth, independent judges are less likely to be
ego-involved in the trial’s outcome than the percipients since it is not their
personal chance to demonstrate ESP in front of others, and may therefore be
less likely to use such strategies as "going for broke" (i.e. artificially 
increasing the correspondence rating of a picture once they are sure it is the 
target, to make it look like a "better" hit) (Stanford and Sargent, 1983) or 
of deliberately avoiding giving points to a target which is a personal 
favorite (Sargent, 1980), although Stanford (1984) suggests avoiding telling 
independent judges that they are assessing ESP data in order to reduce the 
temptation for them also to "go for broke". Sixth, experienced independent 
judges may be more familiar with norms for free-response mentation and may be
able to identify and hence weight appropriately mentation items which are 
unusual better than naive percipients. In a study by Sargent, Bartlett and 
Moss (1982), an independent judge who separated naive percipients’ mentations 
into unusual and common mentations, obtained less of a scoring difference 
between the two than did the percipients who also categorised their own 
mentation. However, the judge also obtained lower scoring than the 
percipients in both categories, indicating that the judge may have been 
handicapped in the judging task (for example, by the percipients’ inability to 
describe their imagery) . 

For reasons similar to those for avoiding percipient judging, several 
experimenters have explicitly recommended combining scores from several 
independent judges to dilute the effect of idiosyncrasies of each judge, such 
as the ability to only detect certain types of correspondence (Stanford, 1984)
[...] items which [...]. Other experimenters have used
judges working in consensus (e.g. Targ and [...]; [...],
1980), presumably for these reasons. [...] studies in which at
least one independent judge was used, only [...] used only one judge; the
number used ranged from one to eight. However, the [...] judges,
multiple or otherwise, [...] judges, whether naturally,
as a result of training, or due to the provision of full and appropriate
instructions (Stanford, 1984).

[...] good judges [...] good judging has been
stressed [...], but
only two studies [...] have [...]
judges of varying background. [...] (1987) found that a
psychotherapist independent judge with considerable experience in and
knowledge of subliminal perception scored [...]
(mean Z = [...]0.187, [...]). A poet scored slightly above chance
(mean Z = +0.127, ns). A third judge, who was a trained "psychic", scored
significantly below chance (mean Z = [...], t = 2.155, p = 0.04). This result
is difficult to interpret since [...] whether the percipients were
"really" scoring above or below chance.

[...] conference abstract, Keeling [...]
clinical psychology graduate students
who judged the percipients’ data obtained [...] above-chance scoring
(p = 0.018, one-tailed), while [...]
a group of twenty undergraduates in an introductory
psychology class, and [...] students in a course
on the occult acting as [...] results. However, the
results of the [...] since the occult
students judged different data from the other two groups, and the
undergraduate psychology students did the judging in a different order from
the other two groups.

[...] two studies,
the judges’ background and expertise would seem [...] an important variable
in any free-response study. However, it [...] in only 10 out of 57
studies using independent judges that the [...] had experience relevant to
judging (in areas of psychology, the [...] literary arts, etc. which deal
specifically with the transformation [...] subconscious information, or previous
experience of [...]).

However, [...] have been [...] Tart (1977) that a judge
with experience of the ganzfeld gave [...] evidence of displacement in a
ganzfeld study while a second judge with [...] experience did not, might
suggest that experience of the experimental procedure used in a study might
usefully be included in any judge’s training. Results from a study by Maher
(1987), in which judges’ scores increased [...] repeated judging of the same
material presented in a different format [...] time, may suggest that simple
repeated exposure to the judging task, or increasing familiarity with the
judging material, may improve scoring.

[...] neglected topic, although [...] chance (mean Z = -0.34), while two
independent judges with [...] above chance (mean Z =
+0.37), the difference being [...] (Wilcoxon T = 15, CR = -3.36,
p<0.001, two-tailed). They had two more judges judge the data without
instructions, yielding a mean Z-score of +0.29, and concluded that the lack of
judging instructions in this case probably did not cause the difference
between the results of the percipients and the original two independent
judges. However, the two uninstructed judges had taken part in a discussion 
of the judging of free-response material in Palmer’s graduate class in
parapsychology some months earlier, and so were not entirely naive. 

Instructions were reported as being given to judges without judging experience
or knowledge of unconscious processes in only 12 out of 47 studies in which
such judges were used. Further research is clearly needed on this topic.

The only reason against using independent judges has been that only the
percipients themselves can have full knowledge of what their imagery really
was, and would be able to recognise personal symbolism (e.g. Palmer, 1986).
This problem might also result in confounding the percipient's ability or
inclination to fully report their imagery with their ESP performance if
independent judging were used, possibly resulting in misleading relationships
between variables (Stanford, 1984). A number of experimenters have
explored the importance of asking percipients for more information about their
mentation after the end of the free-response period, by comparing the
performance of independent judges provided with transcripts of the initial
mentation reports, and with the initial transcripts plus additional
information provided by the percipients.

Stanford (1984) has suggested training percipients in the reporting of
imagery, while Palmer, Bogart, Jones and Tart (1977) suggest having an
experimenter who is blind to the identity of the target review the
percipient's experience with him or her immediately after the response period
and add to the transcript possibly relevant information (such as a full
description of certain images, or the unusual qualities of images,
phenomenological characteristics, and so on). Along these lines, it may also
be advisable to offer percipients the opportunity to draw imagery which may
have been difficult to describe verbally, or vice-versa, depending on the
[...].


Sondow (1979) found that independent judging by two experienced judges of
the initial transcripts only from participants in a ganzfeld study yielded
scoring exactly at chance (15 direct hits in 60 trials), while judging with
the addition of the percipients' personal associations to the mentation gave
significantly above-chance scoring (23 hits, Z=2.39, p<0.02). Each judge
judged half of the initial transcripts only, and half of the transcripts with
associations, so that no judge judged the same trial with and without
associations. However, the percipients' judging yielded even higher scoring
(30 hits in 60 trials).
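
The Z-score quoted for Sondow's study can be checked with the normal
approximation to the binomial distribution. The sketch below assumes a chance
hit probability of 1/4 (implied by 15 hits in 60 trials being described as
exactly at chance), no continuity correction, and a two-tailed p-value; none
of these choices is stated in the study as summarised here, and the function
names are illustrative only.

```python
import math


def hits_z(hits, trials, p_chance):
    """Z-score for a hit count under the normal approximation to the binomial."""
    expected = trials * p_chance
    sd = math.sqrt(trials * p_chance * (1.0 - p_chance))
    return (hits - expected) / sd


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Assumed chance level: one target among four alternatives (15/60 = 0.25).
P_CHANCE = 0.25

z_assoc = hits_z(23, 60, P_CHANCE)   # transcripts plus associations
print(round(z_assoc, 2))             # 2.39
print(two_tailed_p(z_assoc) < 0.02)  # True
```

Under the same assumptions, the 15 hits obtained from the initial transcripts
alone give a Z-score of exactly zero.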

In a dream study by Krippner, Honorton and Ullman (1972), independent 
judges judged first the initial mentation transcript alone, and then the 
transcript plus the results of an interview in which the percipient gave 
details of what mood accompanied the dream, what thoughts or memories it 
brought to mind, what elements of the dream made no sense in terms of the
dreamer's personal life, and what the main theme of the night's dreams had
been. On the initial transcripts alone, the judges obtained two hits out of 
eight trials (with a one in eight chance of success). With the addition of 
the details of the interview, the judges obtained five hits, a result which 
was significantly above chance (p=0.0012, one-tailed). The percipient did not 
do any judging in this study, so no comparison with his scores can be made. 
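
The one-tailed probability quoted for five hits in eight trials is consistent
with an exact binomial tail, which can be sketched as follows. That the
original authors used an exact binomial test is an assumption, since the test
is not named here; the chance probability of 1/8 is taken from the text, and
the function name is illustrative only.

```python
from math import comb


def binomial_upper_tail(hits, trials, p_chance):
    """Exact probability of obtaining at least `hits` successes in `trials`."""
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Five hits in eight trials, each with a one in eight chance of success.
p_with_interview = binomial_upper_tail(5, 8, 1 / 8)
print(round(p_with_interview, 4))  # 0.0012
```

By the same calculation, the two hits obtained from the initial transcripts
alone are well within chance expectation (probability of two or more hits of
about 0.26).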

A similar procedure was used in a study by Ullman, Krippner and Feldstein
(1966). In the interview, the percipient was asked what the dream reminded
him or her of, what if anything seemed to be trying to intrude on the dream,
and whether there was anything in the dream which was different from the
percipient's dreams, such as colour, feeling the dream to be real, or private
symbolism. On the basis of the initial transcripts alone, the three judges
(whose judging experience, if any, was not reported) scored significantly
above chance (F=8.30, p<0.01); with the addition of the interview material
scoring was even higher (F=18.14, p<0.001). The percipient judging yielded
non-significant results with the initial transcript alone, and results above
chance at the p<0.05 level (F=4.41) with the addition of the interview
material.

On the basis of these results, it seems that further elaboration by
percipients on their initial mentation reports adds useful information, since 
scores with such elaboration were higher than those without in all three 
studies discussed above. However, while the percipients still managed to 
score at a higher level than the experienced judges in Sondow's study even 
when the judges were provided with their associations, Ullman, Krippner and 
Feldstein found that their (possibly inexperienced) independent judges scored 
higher than the percipient judges both with and without associations. This 
apparent conflict of findings may be in part due to the extra information 
which Ullman et al. elicited from their percipients during the interview. 
Clearly, more research needs to be directed to this question. 

DISCUSSION 

The most striking feature about judging practices in the literature 
surveyed is their variety, and in some aspects of judging, their 
contradiction. The level of description of aspects of judging is generally 
very brief, and it may be that judging practices are much more similar from
laboratory to laboratory than appears in print. Similarly, the 4% of studies 
using independent judges which involved giving the judges full instructions 
concerning various types of transformation along with detailed examples,
may be an underestimate, since more experimenters may have given their judges 
equally full instructions without reporting it. However, either a lack of 
instructions or a lack of reporting them might imply a lack of importance 
being attached to the judging process within the field. Since judging is 
logically a crucial part of any free-response study, more attention to both
judging and its reporting is surely merited. Delanoy (1987) has listed
information about judging which should ideally be listed in any free-response 
study. 

Although little direct research has been done on the judging process, the 
studies surveyed indicated many potentially profitable lines of research. 
The training of judges (real training with feedback, rather than merely 
repeated exposure to the judging task) has apparently not been explored, and
may be a valuable research strategy in this area. Awareness of individual
differences, methods of responding (verbal, pictorial, etc.), setting and 
target type are among the many variables which need to be considered in 
further research on judging, as well as aspects of procedure such as the use 
of rating scales with appropriate ranges and judging sets of an appropriate 
size for the task. We clearly need to know more about all aspects of judging 
as part of our efforts to improve the reliability and effectiveness of 
free-response experimentation in general. 